Teaching authentic data science without prerequisites

Matt Beckman
Penn State University

Danny Kaplan
Macalester College

U.S. Conference on Teaching Statistics
University Park, PA
May 20, 2017

Background: Introduction to R (Penn State)

Structure

Students

Thoughts before, during, and after course

Background: Data Computing (Macalester)

Tools: Working code

Tools: RMarkdown

Tools: Other Resources

Sample Activities

https://mdbeckman.github.io/USCOTS2017Breakout/

PSU Final Project

Instructions were intentionally somewhat vague:

2013 FBI Crime Reporting

Movies in the 21st Century

FIFA World Rankings Analysis

Stanley Cup Winners

Vegetarian Restaurant Analysis

Student Code

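library(leaflet)  # provides leaflet(), addTiles(), addCircleMarkers(), setView(), and %>%

# Remove duplicate rows, then rows with missing values
data1 <- unique(data)
data2 <- na.omit(data1)

# Interactive map of the restaurants, centered on New York
RestaurantMap <-
  leaflet(data2) %>%
  addTiles() %>%
  addCircleMarkers(radius = 2, color = "red") %>%
  setView(lng = -73.935242, lat = 40.730610, zoom = 12)

RestaurantMap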
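
A minimal sketch of what the two cleaning steps in the student code do, run on a small made-up data frame. The `restaurants` data, its column names (`name`, `lng`, `lat`), and its values are assumptions for illustration only and are not the students' data. The map is built only if the leaflet package happens to be installed.

```r
# Hypothetical stand-in for the restaurant data (values are invented).
restaurants <- data.frame(
  name = c("A", "A", "B", "C"),
  lng  = c(-73.99, -73.99, -73.95, NA),
  lat  = c(40.73, 40.73, 40.78, 40.70)
)

deduped <- unique(restaurants)  # drops the exact duplicate row for "A"
cleaned <- na.omit(deduped)     # drops "C", which has a missing longitude

# Build the map only when leaflet is available; coordinates are named
# explicitly here rather than left for leaflet to infer from column names.
if (requireNamespace("leaflet", quietly = TRUE)) {
  library(leaflet)
  restaurant_map <- leaflet(cleaned) %>%
    addTiles() %>%
    addCircleMarkers(lng = ~lng, lat = ~lat, radius = 2, color = "red") %>%
    setView(lng = -73.935242, lat = 40.730610, zoom = 12)
}
```

Two of the four made-up rows survive the cleaning: one copy of "A" and "B".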

Vegetarian Restaurant Analysis (Static)

Other interesting projects

History of Reddit

Leading Causes of Death in NYC

Analysis of Thanksgiving

Student Outcomes: Core Skills

Student Outcomes: Broad Exposure

Where to go from here?

Feel free to revisit some of your initial thoughts about “authentic data science”

…“without prerequisites”

…or maybe consider some new questions